Combining Committee-Based Semi-supervised and Active Learning and Its Application to Handwritten Digits Recognition

نویسندگان

  • Mohamed Farouk Abdel Hady
  • Friedhelm Schwenker
چکیده

Semi-supervised learning reduces the cost of labeling the training data of a supervised learning algorithm through using unlabeled data together with labeled data to improve the performance. Co-Training is a popular semi-supervised learning algorithm, that requires multiple redundant and independent sets of features (views). In many real-world application domains, this requirement can not be satisfied. In this paper, a single-view variant of Co-Training, CoBC (Co-Training by Committee), is proposed, which requires an ensemble of diverse classifiers instead of the redundant and independent views. Then we introduce two new learning algorithms, QBC-then-CoBC and QBC-with-CoBC, which combines the merits of committee-based semi-supervised learning and committeebased active learning. An empirical study on handwritten digit recognition is conducted where the random subspace method (RSM) is used to create ensembles of diverse C4.5 decision trees. Experiments show that these two combinations outperform the other non committee-based ones.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Active graph based semi-supervised learning using image matching: Application to handwritten digit recognition

With the availability of large amounts of documents and multimedia content to be classified, the creation of new databases with labeled examples is an expensive task. Efficient supervised classifiers often require large training databases that are not always immediately available. Active learning approaches solve this issue by querying an expert to set a label to particular instances. In this p...

متن کامل

Semi-supervised learning for character recognition in historical archive documents

Training recognizers for handwritten characters is still a very time consuming task involving tremendous amounts of manual annotations by experts. In this paper we present semi-supervised labeling strategies that are able to considerably reduce the human effort. We propose two different methods to label and later recognize characters in collections of historical archive documents. The first one...

متن کامل

Persian Handwritten Digit Recognition Using Particle Swarm Probabilistic Neural Network

Handwritten digit recognition can be categorized as a classification problem. Probabilistic Neural Network (PNN) is one of the most effective and useful classifiers, which works based on Bayesian rule. In this paper, in order to recognize Persian (Farsi) handwritten digit recognition, a combination of intelligent clustering method and PNN has been utilized. Hoda database, which includes 80000 P...

متن کامل

Neural Network Based Recognition System Integrating Feature Extraction and Classification for English Handwritten

Handwriting recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. It has numerous applications that includes, reading aid for blind, bank cheques and conversion of any hand written document into structural text form. Neural Network (NN) with its inherent learning ability offers promising solutions for handwritten characte...

متن کامل

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is one of the considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010